Abstract

A measurement of splitting scales, as defined by the $k_T$ clustering algorithm, is presented for
final states containing a W boson produced in proton-proton collisions at a centre-of-mass energy 
of 7 TeV. The measurement is based on the full 2010 data sample corresponding to an integrated 
luminosity of 36 pb$^{-1}$, which was collected using the ATLAS detector at the CERN Large Hadron
Collider. Cluster splitting scales are measured in events containing W bosons decaying to electrons or 
muons. The measurement comprises the four hardest splitting scales in a $k_T$ cluster sequence of the
hadronic activity accompanying the W boson, and ratios of these splitting scales. Backgrounds such 
as multi-jet and top-quark-pair production are subtracted and the results are corrected for detector 
effects. Predictions from various Monte Carlo event generators at particle level are compared to the 
data. Overall, reasonable agreement is found with all generators, but larger deviations between the 
predictions and the data are evident in the soft regions of the splitting scales. 



1 Introduction 

The CERN Large Hadron Collider (LHC), in addition 
to being a discovery machine, produces a wealth of data 
suitable for studies of the strong interaction. Due to the 
strongly interacting partons in the initial state and the 
large phase space available, final states often include hard 
jets arising from QCD bremsstrahlung. Discovery signals, 
on the other hand, often contain jets from quarks pro- 
duced in electroweak interactions. A robust understanding 
of QCD-initiated processes in measurement and theory is 
necessary in order to distinguish such signals from back- 
grounds. 

One critical background for searches is the W+jets
process in the leptonic decay mode, which provides a large 
amount of missing transverse momentum together with 
jets and a lepton. This process is a testing ground for re- 
cent progress in QCD calculations, e.g. at fixed order [1,2] 
or in combination with resummation [3-5], and it has 
been measured using many observables at both the Teva- 
tron [6,7] and the LHC [8-14]. 

In this paper the $k_T$ jet-finding algorithm [15, 16] is
employed for a measurement of differential distributions
of the $k_T$ splitting scales in W+jets events. These measurements
aim to provide results which can be interpreted
particularly well in a theoretical context and improve the
theoretical modelling of QCD effects. The measurement
was performed independently in the electron ($W \to e\nu$)
and muon ($W \to \mu\nu$) final states. Backgrounds such as
multi-jet and top-quark pair production were subtracted 
and results were corrected for detector effects. The result- 
ing data distributions are compared to predictions from 
various Monte Carlo event generators at particle level. 



After an outline of the measurement in this section, 
the data analysis and event selection are summarised in 
Sect. 2. The Monte Carlo (MC) simulations used for the- 
ory comparisons are described in Sect. 3. Distributions at 
the detector level are displayed in Sect. 4. The procedure 
used to correct these to the particle level before any detec- 
tor effects is outlined in Sect. 5 together with a weighting 
technique used to maximise the statistical power avail- 
able, whilst minimising the systematic uncertainty arising 
from pileup. The evaluation of the systematic uncertain- 
ties is summarised in Sect. 6, and the results are shown in 
Sect. 7, followed by the conclusions in Sect. 8. 

1.1 Definition of $k_T$ splitting scales

The $k_T$ jet algorithm is a sequential recombination algorithm.
Its splitting scales are determined by clustering objects
together according to their distance from each other.
The inclusive $k_T$ algorithm uses the following distance
definition [15, 16]:

$$d_{ij} = \min(p_{Ti}^2, p_{Tj}^2)\,\frac{\Delta R_{ij}^2}{R^2}, \qquad \Delta R_{ij}^2 = (y_i - y_j)^2 + (\phi_i - \phi_j)^2,$$

$$d_{iB} = p_{Ti}^2, \qquad (1)$$

where the transverse momentum $p_T$, rapidity $y$ and azimuthal
angle $\phi$ of the input objects are labelled with an
index corresponding to the i-th and j-th momentum in
the input configuration, and B denotes a beam. These
momenta can be determined using energy deposits in the
calorimeter at the detector level, or hadrons at the particle
level in Monte Carlo simulation. The R parameter was
chosen to be R = 0.6 in this paper, which is an intermediate
choice between small values $R \approx 0.2$, whose narrow
width minimises the impact of pileup and the underlying
event, and $R \approx 1.0$, whose large width efficiently collects
radiation.

The clustering from the set of input momenta proceeds 
along the following lines: 

1. Calculate $d_{ij}$ and $d_{iB}$ for all i and j from the input
momenta according to Eq. (1).

2. Find their minimum.

(a) If the minimum is a $d_{ij}$, combine i and j into a
single momentum in the list of input momenta:
$p_{ij} = p_i + p_j$.

(b) If the minimum is a $d_{iB}$, remove i from the input
momenta and declare it to be a jet.

3. Return to step 1 or stop when no particle remains. 

The observables measured are defined as the smallest
of the square roots of the $d_{ij}$ and $d_{iB}$ variables ($\sqrt{d_{ij}}$,
$\sqrt{d_{iB}}$) found at each step in the clustering sequence. To
simplify the notation they are commonly referred to as
the splitting scales $\sqrt{d_k}$, which stand for the minima that
occur when the input list proceeds from $k+1$ to $k$ momenta
by clustering and removing in each step. For example, $\sqrt{d_0}$
is found from the last step in the clustering sequence and
reduces to the transverse momentum of the highest-$p_T$ jet.
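
The clustering steps and the definition of $\sqrt{d_k}$ can be illustrated with a minimal Python sketch. It is not the analysis code: the function name `kt_splitting_scales` is hypothetical, the inputs are assumed to be massless objects given as ($p_T$, $y$, $\phi$), and the recombination is a plain four-vector sum. The example input values are invented.

```python
import math

def kt_splitting_scales(momenta, R=0.6):
    """Inclusive kT clustering of (pT, y, phi) inputs; returns {k: sqrt(d_k)}.

    Assumption: inputs are treated as massless and merged by four-vector
    addition. Illustrative only, not the analysis code.
    """
    def to_vec(pt, y, phi):
        # (E, px, py, pz) of a massless object
        return (pt * math.cosh(y), pt * math.cos(phi),
                pt * math.sin(phi), pt * math.sinh(y))

    def kinematics(v):
        e, px, py, pz = v
        return (math.hypot(px, py),
                0.5 * math.log((e + pz) / (e - pz)),
                math.atan2(py, px))

    objs = [to_vec(*m) for m in momenta]
    scales = {}
    while objs:
        kin = [kinematics(v) for v in objs]
        best = None  # (distance, i, j); j is None for a beam distance
        for i, (pti, yi, phii) in enumerate(kin):
            if best is None or pti ** 2 < best[0]:
                best = (pti ** 2, i, None)           # d_iB = pT_i^2
            for j in range(i + 1, len(kin)):
                ptj, yj, phij = kin[j]
                dphi = abs(phii - phij)
                if dphi > math.pi:
                    dphi = 2.0 * math.pi - dphi
                dr2 = (yi - yj) ** 2 + dphi ** 2
                dij = min(pti, ptj) ** 2 * dr2 / R ** 2
                if dij < best[0]:
                    best = (dij, i, j)
        d, i, j = best
        if j is None:
            objs.pop(i)                               # step 2(b): declare a jet
        else:
            merged = tuple(a + b for a, b in zip(objs[i], objs[j]))
            objs.pop(j)                               # j > i, so index i is unaffected
            objs[i] = merged                          # step 2(a): p_ij = p_i + p_j
        scales[len(objs)] = math.sqrt(d)              # k+1 -> k objects gives sqrt(d_k)
    return scales

# Three-object configuration with invented values.
example = kt_splitting_scales([(30.0, 0.0, 0.0), (50.0, 1.0, 2.5), (10.0, 1.2, 2.6)])
```

In this invented configuration the two nearby objects merge first, the isolated 30 GeV object is then removed as a jet, and the last step yields the $p_T$ of the merged pair.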

Figure 1 schematically displays the clustering sequence
derived from an original input configuration of three objects
labelled $p_1$, $p_2$, $p_3$ in the presence of beams $B_1$ and
$B_2$. In the first clustering step, where three objects are
grouped into two (denoted $3 \to 2$), the minimal splitting
scale is found between momenta $p_2$ and $p_3$, leading
to $d_2 = d_{23}$. In the second step ($2 \to 1$), the momentum
$p_1$ is closest to the beam, and thus is removed and declared
a jet at the scale $d_1 = d_{1B} = p_{T1}^2$. Ultimately, the
third clustering ($1 \to 0$) has only the beam distance of
the combined input $p_{23}$ remaining, leading to a scale of
$d_0 = d_{(23)B} = p_{T(23)}^2$.

Fig. 1. Illustration of the $k_T$ clustering sequence starting from
the original input configuration (three objects $p_1$, $p_2$, $p_3$, and
beams $B_1$, $B_2$). At each step, $k+1$ objects are merged to $k$.



1.2 Features of the observables 

An important feature of these observables is their separation
into two regions: a "hard" one with $\sqrt{d_k} > 20$ GeV,
which is dominated by perturbative QCD effects, and a
"soft" one in which more phenomenological modelling aspects
such as hadronisation and multiple partonic interactions
may exert substantial influence on theory predictions.
The number of events in the hard region for high k
is naturally low in the data sample analysed for this measurement.
Thus, for statistical reasons, values of $0 \le k \le 3$
are considered in this publication. No explicit jet requirement
is imposed in the event selection.

In addition to the observables mentioned above, it is
also interesting to study ratios of consecutive clustering
values, $\sqrt{d_{k+1}/d_k}$, where some experimental uncertainty
cancellations occur, as discussed in Sect. 6. Of particular
interest is the region where $\sqrt{d_{k+1}/d_k} \to 1$, as it probes
events with subsequent emissions at similar scales. Those
events could be challenging to describe correctly for parton
shower generators without matrix element corrections.
The splitting scale ratio amounts to a normalisation of
the splitting scale to the scale of the QCD activity in
the "underlying process", i.e. after the clustering. To reduce
the influence of non-perturbative effects, each ratio
observable $\sqrt{d_{k+1}/d_k}$ is measured with events satisfying
$\sqrt{d_k} > 20$ GeV.
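
As a small illustration of the ratio observable and its 20 GeV requirement, the following Python sketch computes $\sqrt{d_{k+1}/d_k}$ from a mapping of splitting scales. The function name `splitting_ratio` and the dictionary layout are assumptions made for this example.

```python
def splitting_ratio(sqrt_d, k, min_scale=20.0):
    """Return sqrt(d_{k+1} / d_k) for one event, or None if the event fails
    the requirement sqrt(d_k) > min_scale (in GeV).

    sqrt_d: mapping k -> sqrt(d_k) in GeV (hypothetical layout).
    """
    if sqrt_d[k] <= min_scale:
        return None
    # sqrt(d_{k+1} / d_k) equals the ratio of the two square roots
    return sqrt_d[k + 1] / sqrt_d[k]

# Invented splitting scales for two events.
hard_event = {0: 60.0, 1: 30.0, 2: 3.0}
soft_event = {0: 15.0, 1: 10.0}
ratio_10 = splitting_ratio(hard_event, 0)
```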

The central idea underlying this measurement is that
the measure of the $k_T$ algorithm corresponds relatively
well to the singularity structure of QCD. To illustrate this,
the small-angle limit of the squared $k_T$ measure is given
in terms of the angle $\theta_{ij}$ between two momenta i and j,
and the energy corresponding to the softer momentum,
$E_i$, by [15]:

$$p_{Ti}^2\,\Delta R_{ij}^2 \simeq E_i^2\,\theta_{ij}^2, \qquad (2)$$

$$p_{Ti}^2 \simeq E_i^2\,\theta_{iB}^2, \qquad (3)$$

while the splitting probability for a final-state branching
into partons i and j evaluates to

$$\frac{\mathrm{d}P_{ij \to i,j}}{\mathrm{d}E_i\,\mathrm{d}\theta_{ij}} \sim \frac{1}{\min(E_i, E_j)\,\theta_{ij}}, \qquad (4)$$

in the collinear limit [17].

From a comparison of Eqs. (2) and (4) it can be seen
that each step of the $k_T$ algorithm identifies the parton
pair which would be the most likely to have been produced
by QCD interactions. In that sense, this clustering
sequence mimics the reversal of the QCD evolution.

In contrast, the anti-$k_t$ [18] algorithm cannot be used
in the same way: its distance measure replaces all $p_T^2$ by
$p_T^{-2}$. So even though collinear branchings are still clustered
first, the same is no longer true for soft emissions. Thus
the splitting structure within the anti-$k_t$ algorithm must
be constructed via the $k_T$ splitting algorithm [19].
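
The different treatment of soft emissions can be made concrete with a short Python sketch that evaluates the pair distance with $p_T^2$ ($k_T$) and with $p_T^{-2}$ (anti-$k_t$). The helper `pair_distance` and all numerical values are invented for illustration.

```python
def pair_distance(pt_i, pt_j, delta_r2, R=0.6, power=1):
    """Pair distance min(pT_i^(2p), pT_j^(2p)) * DeltaR^2 / R^2.

    power=1 gives the kT measure of Eq. (1); power=-1 gives the
    anti-kt measure, in which pT^2 is replaced by pT^-2.
    """
    return min(pt_i ** (2 * power), pt_j ** (2 * power)) * delta_r2 / R ** 2

# Invented configuration: a 100 GeV object with a soft 1 GeV neighbour at
# Delta R = 0.5 and a hard 80 GeV neighbour at Delta R = 0.3.
hard, soft, hard2 = 100.0, 1.0, 80.0
kt_soft = pair_distance(hard, soft, 0.5 ** 2, power=1)
kt_hard = pair_distance(hard, hard2, 0.3 ** 2, power=1)
akt_soft = pair_distance(hard, soft, 0.5 ** 2, power=-1)
akt_hard = pair_distance(hard, hard2, 0.3 ** 2, power=-1)
```

With these values the $k_T$ measure is smallest for the soft pair, whereas the anti-$k_t$ measure is smallest for the angularly closer hard pair, so the soft emission is no longer clustered first.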

Just like QCD matrix elements, the $k_T$ splitting scales
provide a unified view of initial- and final-state radiation.
Through the combination of the distance to the beams and
the relative distance of objects to each other, the $\sqrt{d_k}$ distributions
contain information about both the $p_T$ spectra
and the substructure of jets.

1.3 Existing predictions and measurements 

The $k_T$ splittings and related distributions have attracted
the attention of theorists, in $W \to \ell\nu$ and similar final
states. They can be resummed analytically at next-to-leading-logarithm
accuracy as demonstrated for the example
of jet production by QCD processes in hadron collisions
in Refs. [20,21]. The ratio observable $y_{23}$ defined
by the authors is closely related to the ratio observables
$\sqrt{d_{k+1}/d_k}$ in this analysis. Other theoretical studies may
be found in Refs. [22,23].

Experimentally, these kinds of observables were measured
at LEP [24-26] using the $e^+e^-$ (Durham) $k_T$ algorithm.
Their theoretical features (resummability) were
used in Refs. [27, 28] to determine $\alpha_s$ with high precision.
Related observables were also measured at HERA [29-32].

2 Data analysis 

2.1 The ATLAS detector 

The ATLAS detector [33] at the LHC covers nearly the 
entire solid angle around the collision point. It consists of 
an inner tracking detector surrounded by a thin supercon- 
ducting solenoid, electromagnetic and hadronic calorime- 
ters, and a muon spectrometer incorporating three large 
superconducting toroid magnets. 

The inner-detector system is immersed in a 2 T axial
magnetic field and provides charged particle tracking in
the range $|\eta| < 2.5$.$^1$ The high-granularity silicon pixel
detector covers the vertex region and typically provides
three measurements per track. It is followed by the silicon
microstrip tracker, which usually provides four two-dimensional
measurement points per track. These silicon
detectors are complemented by the transition radiation
tracker, which contributes to track reconstruction up to
$|\eta| = 2.0$. The transition radiation tracker also provides
electron identification information based on the fraction of
hits (typically 30 in total) above a higher energy-deposit
threshold corresponding to transition radiation.

$^1$ ATLAS uses a right-handed coordinate system with its origin
at the nominal interaction point (IP) in the centre of the
detector and the z-axis along the beam pipe. The x-axis points
from the IP to the centre of the LHC ring, and the y-axis
points upward. Cylindrical coordinates $(r, \phi)$ are used in the
transverse plane, $\phi$ being the azimuthal angle around the beam
pipe. The pseudorapidity is defined in terms of the polar angle $\theta$ as
$\eta = -\ln\tan(\theta/2)$.

The calorimeter system covers the pseudorapidity range
$|\eta| < 4.9$. Within the region $|\eta| < 3.2$, electromagnetic
calorimetry is provided by barrel and endcap high-granularity
lead/liquid-argon (LAr) calorimeters, with an additional
thin LAr presampler covering $|\eta| < 1.8$ to correct
for energy loss in material upstream of the calorimeter.
Hadronic calorimetry is provided by a steel/scintillator-tile
calorimeter, segmented radially into three barrel structures
within $|\eta| < 1.7$, and two copper/LAr hadronic endcap
calorimeters. The solid angle coverage is completed
with forward copper/LAr and tungsten/LAr calorimeter
modules optimised for electromagnetic and hadronic measurements
respectively.

The muon spectrometer comprises separate trigger and
high-precision tracking chambers measuring the deflection
of muons in a magnetic field generated by superconducting
air-core toroids. The precision chamber system covers
the region $|\eta| < 2.7$ with three layers of monitored drift
tubes, complemented by cathode strip chambers in the forward
region, where the background is highest. The muon
trigger system covers the range $|\eta| < 2.4$ with resistive
plate chambers in the barrel, and thin gap chambers in
the endcap regions.

A three-level trigger system is used to select interesting
events [34]. The Level-1 trigger is implemented in hardware
and uses a subset of detector information to reduce
the event rate to a design value of at most 75 kHz. This
is followed by two software-based trigger levels which together
reduce the event rate to about 200 Hz.

2.2 Event selection

The selection of W events is based on the criteria described
in Refs. [13, 35] and summarised briefly below.

2.2.1 Data sample and trigger

The entire 2010 data sample at $\sqrt{s} = 7$ TeV was used, corresponding
to an integrated luminosity of approximately
36 pb$^{-1}$. The 2010 data sample was chosen due to the
low pileup conditions during data taking, where the mean
number of interactions per bunch crossing was at most 2.3
during that period. In the $W \to \mu\nu$ analysis, the first few
pb$^{-1}$ were excluded to restrict to a data sample of events
recorded with a uniform trigger configuration and optimal
detector performance.

Single-lepton triggers were used to retain $W \to \ell\nu$ candidate
events. For the electron channel a trigger threshold
of 14 GeV for early data-taking periods and 15 GeV for 
later data-taking periods was applied. For the muon chan- 
nel a trigger threshold of 13 GeV was applied. All relevant 
detector components were required to be fully operational 
during the data taking. Events with at least one recon- 
structed interaction vertex within 200 mm of the inter- 
action point in the z direction and having at least three 
associated tracks were considered. The number of recon- 
structed vertices reflects the pileup conditions and, in both 
channels, was used to reweight the MC simulation to im- 
prove its modelling of the pileup conditions observed in 
data. The number of reconstructed vertices was also used 
to estimate the uncertainty due to possible mismodelling 
of the pileup. 

2.2.2 Electron selection 

Clusters formed from energy depositions in the electro- 
magnetic calorimeter were required to have matched tracks, 
with the further requirement that the cluster shapes are 
consistent with electromagnetic showers initiated by elec- 
trons. On top of the tight identification criteria, a calori- 
meter-based isolation requirement for the electron was 
applied to further reduce the multi-jet background. Ad- 
ditional requirements were applied to remove electrons 
falling into calorimeter regions with non-operational LAr 
readout. The kinematic requirements on the electron candidates
included a transverse momentum requirement $p_T^e > 20$ GeV
and pseudorapidity $|\eta^e| < 2.47$ with removal of the
transition region $1.37 < |\eta^e| < 1.52$ between the calorimeter
modules. Exactly one of these selected electrons was
required for the $W \to e\nu$ selection. In constructing the
$k_T$ cluster sequence, clusters of calorimeter cells included
in a reconstructed jet within $\Delta R = 0.3$ of the electron
candidate were removed from the input configuration.

2.2.3 Muon selection 

Muon candidates were required to have tracks reconstructed
in both the muon spectrometer and inner detector,
with $p_T^\mu$ above 20 GeV and pseudorapidity $|\eta^\mu| < 2.4$. Requirements
on the number of hits used to reconstruct the
track in the inner detector were applied, and the muon's
point of closest approach to the primary vertex was required
to be displaced in z by less than 10 mm. Track-based
isolation requirements were also imposed on the reconstructed
muon. At least one muon was required for the
$W \to \mu\nu$ selection. To retain consistency with the acceptance
in the electron channel, when constructing the $k_T$
cluster sequence, clusters of calorimeter cells falling close
to the muon candidate were removed from the input configuration
as in the electron selection.



2.2.4 Selection of W candidate events and construction of 
observables 

The $W \to \ell\nu$ event selection required that the magnitude
of the missing transverse momentum, $E_T^{miss}$ [36], be
greater than 25 GeV. The reconstructed transverse mass
obtained from the lepton transverse momentum and
$E_T^{miss}$ vectors was required to fulfil

$$m_T = \sqrt{2\,(p_T^{\ell} E_T^{miss} - \vec{p}_T^{\,\ell} \cdot \vec{E}_T^{miss})} > 40\ \mathrm{GeV}.$$

No requirements were made with respect to the number of
reconstructed jets in the event.

The observables defined in Sect. 1.1 were constructed 
using calorimeter energy clusters within a pseudorapidity 
range of $|\eta^{cl}| < 4.9$. The clusters were seeded by calorimeter
cells with energies at least $4\sigma$ above the noise level.
The seeds were then iteratively extended by including all
neighbouring cells with energies at least $2\sigma$ above the noise
level. The cell clustering was finalised by the inclusion of 
the outer perimeter cells around the cluster. The so-called 
topological clusters that resulted were calibrated to the 
hadronic energy scale [37, 38], by applying weights to account
for calorimeter non-compensation, energy lost up-
stream of the calorimeters and noise threshold effects. 

2.3 Background treatment 

The contributions of electroweak backgrounds ($Z \to \ell\ell$,
$W \to \tau\nu$ and diboson production), as well as $t\bar{t}$ and
single-top-quark production, to both channels were estimated
using the MC simulation. The absolute normalisation was 
derived using the total theoretical cross sections and cor- 
rected using the acceptance and efficiency losses of the 
event selection. The shape and normalisation of the dis- 
tributions of various observables for the multi-jet back- 
ground were determined using data-driven methods in both 
analysis channels. For the $W \to e\nu$ selection, the background
shape was obtained from data by reversing certain
calorimeter-based electron identification criteria to
produce a multi-jet-enriched sample. Similarly, to estimate
the multi-jet contribution to $W \to \mu\nu$, the background
shape was obtained from data by inverting the
requirements on the muon transverse impact parameter 
and its significance. These multi-jet enriched samples pro- 
vided the shapes of the distributions of multi-jet back- 
ground observables. The normalisation of the multi-jet 
background was determined by fitting a linear combination
of the multi-jet and leptonic $E_T^{miss}$ shapes to the
observed $E_T^{miss}$ distribution, following the procedures described
in Refs. [13,35]. The total contribution of background
to the signal was thus estimated to be 5% for the
$W \to e\nu$ analysis and 9% for the $W \to \mu\nu$ analysis.
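
The idea of fitting a linear combination of two template shapes to an observed distribution can be sketched in Python as follows. This is only an illustration: it uses an unweighted least-squares fit as a stand-in, whereas the actual fit procedure is the one documented in Refs. [13,35], and the function name and bin contents are invented.

```python
import numpy as np

def fit_template_scales(data, multijet_shape, leptonic_shape):
    """Least-squares scale factors (multi-jet, leptonic) such that
    scale_mj * multijet_shape + scale_lep * leptonic_shape approximates data.

    Assumption: unweighted least squares, chosen for brevity.
    """
    templates = np.column_stack([multijet_shape, leptonic_shape]).astype(float)
    coeffs, *_ = np.linalg.lstsq(templates, np.asarray(data, dtype=float), rcond=None)
    return coeffs

# Invented binned E_T^miss templates and a pseudo-data spectrum built from them.
multijet = np.array([50.0, 30.0, 15.0, 5.0])
leptonic = np.array([5.0, 20.0, 40.0, 35.0])
pseudo_data = 2.0 * multijet + 3.0 * leptonic
scales_fit = fit_template_scales(pseudo_data, multijet, leptonic)
```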

3 Monte Carlo simulations 

All detector-level studies and the extraction of particle-level
distributions involved two signal MC generators, Alpgen+Herwig
and Sherpa. Alpgen v2.13 [39], a matrix-element
(ME) generator, was interfaced to Herwig
v6.510 [40] for parton showering (PS) and hadronisation,
and to Jimmy v4.31 [41] for multiple parton interactions.
The MLM [22] matching scheme was used to combine W-boson
production samples having up to five partons with
the parton shower, with the matching scale set at 20 GeV.
Sherpa v1.3.1 [42] was used to generate an alternative
signal sample of events with W + jets, using a ME+PS 
merging approach [23] to prevent double counting from 
the parton shower, and extending the original CKKW 
method [43] by taking into account truncated shower emis- 
sions. Up to five partons were generated in the ME and 
the matching scale was set to 30 GeV. 

The single-top-quark background events were gener- 
ated at next-to-leading-order (NLO) accuracy using the 
Mc@Nlo v3.3.1 [44] generator. Mc@Nlo was interfaced
to Herwig and Jimmy. The Powheg v1.01 [45] generator,
interfaced to Pythia6 v6.421 [46], was used to simulate
the $t\bar{t}$ background. The background from diboson production
was generated using Herwig. Backgrounds from
inclusive Z production were simulated using Pythia6. 

Three sets of parton density functions (PDFs) were 
used in these MC samples: CTEQ6L1 [47] for the Alpgen 
samples and the parton showering and underlying event in 
the Powheg samples interfaced to Pythia6; MRST 2007 
LO* [48] for Pythia6 and Herwig; and CTEQ6.6M [49] 
for Mc@Nlo, Sherpa, and the NLO matrix element cal- 
culations in Powheg. The underlying event tunes were 
AUET1 [50] for the Herwig, Alpgen, and Mc@Nlo 
samples, and AMBT1 [51] for the Pythia6 and Powheg 
samples. The samples generated with Sherpa used the 
default underlying event tune. 

Each generated event was passed through the standard 
ATLAS detector simulation [52], based on Geant4 [53]. 
The MC events were reconstructed and analysed using 
the same software chain as applied to the data. The re- 
sulting MC predictions for the samples were normalised 
to their respective theoretical cross sections calculated at 
NLO [13], with the exception of the W and Z samples 
which were normalised to NNLO [54], and the multi-jet 
background which was normalised to a value extracted 
from the data as is described in Sect. 2. 

At the particle level, some additional W+jets NLO
MC generators were compared to the final results. The 
Powheg [45,55] samples were matched to Pythia6 v6.425 
or Pythia8 v8.165 [56] for parton showering and hadroni- 
sation, while another sample was generated with Mc@Nlo 
v4.06 [44] using Herwig v6.520.2. The Sherpa Menlops 
sample used Sherpa v1.4.1 with its built-in Menlops
method [4], allowing an NLO+PS matched sample for in- 
clusive W production [57] to be merged with LO matrix 
elements for a W boson and up to five partons using a 
matching scale at 20 GeV. All these NLO samples were 
generated with the CT10 PDF set [58]. 

The Mc@Nlo, Powheg and Alpgen+Herwig sam- 
ples were supplemented with a simulation of QED final- 
state radiation using Photos v2.15.4 [59] and tau de- 
cays using Tauola v27feb06 [60]. The Sherpa samples 
included QED final-state radiation in a different resum- 
mation approach [61] and a built-in tau decay algorithm. 



4 Detector-level comparisons of Monte Carlo 
to data 

The observed and expected detector-level distributions for
$\sqrt{d_0}$ in the electron and muon channels are shown in Fig. 2,
where the MC signal predictions are provided by Alpgen+Herwig
normalised to NNLO predictions [54]. The
corresponding plots for $\sqrt{d_1}$, $\sqrt{d_2}$ and $\sqrt{d_3}$ can be found
in Appendix A.1. Figure 3 shows the ratio of the second-hardest
to the hardest splitting scale in each event. Again,
the sub-leading ratio distributions at detector level are displayed
in Appendix A.1. For the hardest clustering in the
event, generally good agreement between the Alpgen+Herwig
MC predictions and the data is observed.
The agreement is similar for both the electron and the
muon channels.



5 Particle-level extraction 

5.1 Corrections for detector effects 

After subtraction of backgrounds, the detector-level distributions
were corrected ("unfolded") to the final-state
particle level separately for the two channels, taking into 
account the effects of pileup and detector response. The 
unfolding was performed with the RooUnfold [62] package, 
using a Bayesian algorithm [63], in which Bayes' theorem
was used to derive the particle-level distributions from the 
detector-level distributions, over three iterations. The in- 
put for the algorithm at particle and detector level was 
taken from the Alpgen+Herwig sample as a default. 
Both the MC simulation and data-driven methods were 
used to demonstrate that this iterative Bayesian method 
was able to recover the corresponding particle-level distri- 
butions. 
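
The principle of the iterative Bayesian method can be illustrated with a compact NumPy sketch. It is not the RooUnfold interface and omits uncertainty propagation and regularisation details; the function name `bayes_unfold`, the response-matrix convention and the example numbers are assumptions made for this illustration.

```python
import numpy as np

def bayes_unfold(measured, response, prior, n_iter=3):
    """Iterative Bayesian unfolding of a binned spectrum.

    response[j, i] is taken to be P(reconstructed in bin j | true bin i),
    so column sums give the efficiency per truth bin. Assumes every
    reconstructed bin has a non-zero expected content.
    """
    measured = np.asarray(measured, dtype=float)
    response = np.asarray(response, dtype=float)
    truth = np.asarray(prior, dtype=float).copy()
    efficiency = response.sum(axis=0)
    for _ in range(n_iter):
        folded = response @ truth                      # expected detector-level spectrum
        # Bayes' theorem: P(true i | reco j) = R[j, i] * truth[i] / folded[j]
        posterior = response * truth / folded[:, None]
        truth = (posterior * measured[:, None]).sum(axis=0) / efficiency
    return truth

# Invented two-bin closure test: unfolding the folded truth with the
# truth itself as prior should return the truth.
R_example = np.array([[0.8, 0.2], [0.2, 0.8]])
true_example = np.array([100.0, 50.0])
reco_example = R_example @ true_example
unfolded_example = bayes_unfold(reco_example, R_example, true_example)
```

Each iteration replaces the prior by the previous estimate, so a small number of iterations, such as the three used here, limits the dependence on the initial MC spectrum without amplifying statistical fluctuations too strongly.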

The selection requirements applied to the event at the 
particle level are: 

• $p_T^{\ell} > 20$ GeV ($\ell$ = electron e or muon $\mu$)

• $|\eta^e| < 2.47$ excluding $1.37 < |\eta^e| < 1.52$

• $|\eta^\mu| < 2.4$

• $p_T^{\nu_{lead}} > 25$ GeV ($\nu_{lead}$ = highest-$p_T$ neutrino in event)

• $m_T > 40$ GeV

Only events with exactly one lepton passing the re- 
quirements were taken into account. Leptons were defined 
to include all photon radiation within a cone of AR = 0.1 
around the final-state lepton as suggested in Ref. [64]. All
lepton requirements were calculated from these combined
objects. The observables defined in Sect. 1.1 were constructed
using all stable particles within a pseudorapidity
range of $|\eta^{cl}| < 4.9$ with lifetime greater than 10 ps, excluding
the lepton and neutrino originating from the W
boson decay. 

5.2 Weighted combination 

Fig. 2. Uncorrected splitting scale $\sqrt{d_0}$ for events passing the $W \to e\nu$ (left) and $W \to \mu\nu$ (right) selection requirements.
The distributions from the data (markers) are compared with the predicted signal from the MC simulation, provided by
Alpgen+Herwig and normalised to the NNLO prediction. In addition, physics backgrounds, also shown, have been added in
proportion to the predictions from the MC simulation. The ratio between the expectation and the data is shown in the lower
plot. The error bars shown on the data are statistical only.

To reduce the impact of imperfect MC modelling of pileup
effects, whilst optimising the statistical power available,
two different event samples were defined and utilised as
follows.

— "Low-pileup sample": exactly one reconstructed ver- 
tex was required in data. The response matrices used 
to unfold the data and the background templates were 
also constructed from events where exactly one recon- 
structed vertex was required. 

— "High-pileup sample": as above, with the difference 
that the number of reconstructed vertices was required 
to be greater than one. 

At large $\sqrt{d_k}$, the statistical uncertainty of the high-pileup
sample is smaller than that in the low-pileup sample.
However, at small $\sqrt{d_k}$, the systematic pileup uncertainty
of the low-pileup sample is smaller than that in the
high-pileup sample. To minimise the overall uncertainty
on the measurement, the distributions were combined as
follows. For each bin of the final distribution, the best estimate
N was calculated from the bin contents $N_1$, $N_2$ of
the distributions in the low-pileup and high-pileup samples
respectively, as

$$N = \frac{N_1 \cdot W_1 + N_2 \cdot W_2}{W_1 + W_2}. \qquad (5)$$

The weights $W_i$ for each sample were constructed from
the inverse of the sum in quadrature of the statistical and
pileup uncertainties on the low-pileup and the high-pileup
samples. The evaluation of the pileup uncertainty on each
sample is described in detail in Sect. 6. The statistical uncertainty
of the final distribution was calculated assuming
no correlation between the two samples.
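
A per-bin version of this combination can be sketched in Python. The precise form of the weights is an assumption here: they are taken as inverse variances, $W_i = 1/(\sigma_{stat,i}^2 + \sigma_{pileup,i}^2)$, which is one reading of the description above. The function name and the input values are invented.

```python
import math

def combine_bin(n1, stat1, pileup1, n2, stat2, pileup2):
    """Weighted combination of one bin from the low-pileup (1) and
    high-pileup (2) samples, following Eq. (5).

    Assumption: W_i = 1 / (stat_i^2 + pileup_i^2).
    Returns the combined content and its statistical uncertainty,
    propagated assuming the two samples are uncorrelated.
    """
    w1 = 1.0 / (stat1 ** 2 + pileup1 ** 2)
    w2 = 1.0 / (stat2 ** 2 + pileup2 ** 2)
    combined = (n1 * w1 + n2 * w2) / (w1 + w2)
    stat = math.sqrt((w1 * stat1) ** 2 + (w2 * stat2) ** 2) / (w1 + w2)
    return combined, stat

# Invented bin contents with equal uncertainties: the result is the plain average.
n_comb, stat_comb = combine_bin(10.0, 1.0, 0.0, 20.0, 1.0, 0.0)
```

A bin in which the high-pileup sample carries a large pileup uncertainty is thus dominated by the low-pileup sample, and vice versa where the low-pileup sample is statistically limited.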



6 Systematic uncertainties 

To evaluate the impact of a particular source of systematic 
uncertainty at the particle level, the observable considered 
was varied within its uncertainty, the response matrix was 
recalculated taking this variation into account, and the 
new response matrix was used to unfold the data. The 
fractional shift in the resulting unfolded data from nomi- 
nal was interpreted as the systematic uncertainty due to 
that particular effect. The separate sources of uncertainty 
are described in the following. 

The relative systematic uncertainty on the energy scale
of the topological clusters was evaluated from a combination
of MC studies and single-pion response measurements [36]
to be $1 \pm a \times (1 + b/p_T^{cl})$, where $p_T^{cl}$ represents
the transverse momentum of each cluster. The constants a
and b were determined to be a = 3 (10)% when $|\eta^{cl}| < 3.2$
($|\eta^{cl}| > 3.2$), and b = 1.2 GeV. A shift of the cluster
energy results in a shift of the distributions to higher or
lower values. The uncertainty due to the cluster energy
scale was thus evaluated separately for the low-pileup and
high-pileup distributions and combined in a weighted linear
sum. The uncertainty ranges from 5% to 55% for the
splitting scales $\sqrt{d_k}$ and from 2% to 85% for the $\sqrt{d_{k+1}/d_k}$
ratio distributions.
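
The cluster energy scale variation quoted above can be written as a per-cluster scale factor, sketched here in Python. The function name is hypothetical, and how the shifted clusters are subsequently re-clustered and unfolded is not shown.

```python
def cluster_energy_scale_factor(pt_cl, eta_cl, direction=+1):
    """Scale factor 1 +/- a * (1 + b / pT_cl) for one topological cluster.

    pt_cl in GeV; direction is +1 for the upward and -1 for the downward shift.
    a = 3% for |eta_cl| < 3.2 and 10% otherwise, b = 1.2 GeV, as quoted in the text.
    """
    a = 0.03 if abs(eta_cl) < 3.2 else 0.10
    b = 1.2  # GeV
    return 1.0 + direction * a * (1.0 + b / pt_cl)

# Invented clusters: a soft central one and a harder forward one.
up_central = cluster_energy_scale_factor(1.2, 0.0, +1)
down_forward = cluster_energy_scale_factor(12.0, 4.0, -1)
```

The $b/p_T^{cl}$ term makes the relative shift largest for the softest clusters, which is consistent with the uncertainty being most pronounced at small splitting scales.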

The lepton trigger, identification and reconstruction 
efficiencies as well as the lepton energy scale and resolution 
were measured in data using $Z \to \ell\ell$ events via the tag-and-probe
method, as described in Refs. [13,35,65]. The
uncertainty is less than 3% for the splitting scales $\sqrt{d_k}$
and less than 1% for the $\sqrt{d_{k+1}/d_k}$ ratio distributions.

The systematic uncertainty due to possible MC mismodelling
of pileup was evaluated separately on the low-pileup
and high-pileup distributions. The impact of pileup
mismodelling on the low-pileup sample was evaluated by
varying the requirements on the z-displacement of the 
interaction vertex and the number of associated tracks. 
An additional uncertainty accounts for the possible mis- 
modelling of contributions from adjacent bunch-crossings. 
It was evaluated by comparing two different data-taking 
periods: one in which proton bunches were arranged in 
trains, and the other without bunch trains. The impact of 
pileup mismodelling on the high-pileup sample was eval- 
uated as the fractional difference between the particle- 
level measurements for the low-pileup and the high-pileup 
events, with the statistical uncertainty subtracted in quadrature.
The uncertainty ranges from 1% to 30% for the splitting
scales $\sqrt{d_k}$ and is largest for small splitting scales. For
the $\sqrt{d_{k+1}/d_k}$ ratio distributions the uncertainty ranges
from 1% to 15%.



The uncertainty inherent in the unfolding procedure it- 
self was estimated by reweighting the response matrix in 
the unfolding such that Alpgen+Herwig would accu- 
rately model the distribution under consideration as mea- 
sured from data at reconstruction level. A second varia- 
tion was performed by creating a response matrix from 
Sherpa. The larger effect, per bin, obtained from these 
two estimates of the systematic uncertainty was taken as 
the systematic uncertainty due to unfolding. The uncertainty
ranges between 5% and 55% for the splitting scales
$\sqrt{d_k}$, being largest for small values of $\sqrt{d_k}$ and in the
vicinity of $\sqrt{d_k} \sim 15$ GeV. For the $\sqrt{d_{k+1}/d_k}$ ratio
distributions the uncertainty ranges between 1% and 35%.
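
The per-bin choice between the two unfolding variations can be sketched as follows. This is only an illustration: the function name and inputs are hypothetical, and the three lists are assumed to hold the unfolded bin contents obtained with the nominal response matrix, the reweighted matrix and the Sherpa-based matrix.

```python
def unfolding_uncertainty(nominal, reweighted, alternative):
    """Per-bin fractional unfolding uncertainty: the larger of the two shifts."""
    out = []
    for nom, rew, alt in zip(nominal, reweighted, alternative):
        shift_rew = abs(rew - nom) / nom  # reweighted response matrix
        shift_alt = abs(alt - nom) / nom  # response matrix from the other generator
        out.append(max(shift_rew, shift_alt))
    return out
```

Taking the maximum bin by bin, rather than adding the two shifts, treats them as two estimates of the same effect.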

The systematic uncertainties on the electroweak and 
top-quark background normalisations were assigned us- 
ing the theoretical uncertainty on the cross section of 
each process under consideration. The uncertainty on the 
multi-jet background normalisation was obtained by vary- 
ing the methods used for extracting this value from data, 
as described in Refs. [13,35]. An additional uncertainty 
was included on the shape of the multi-jet contribution, 
which was derived by comparing data-driven and simulation
estimates of this background contribution. The uncertainty
ranges from 0.5% to 15% for the splitting scales
$\sqrt{d_k}$ and from 1% to 20% for the $\sqrt{d_{k+1}/d_k}$ ratio
distributions.

The magnitudes of the separate uncertainties for the
hardest and fourth-hardest splittings are summarised in
Figs. 4 and 5, where the statistical errors are also shown.
Other cases are available in Appendix A.2. The cluster
energy scale, pileup, and the unfolding procedure are the
dominant sources of uncertainty in both the electron and
muon channels.

Fig. 4. Summary of the systematic uncertainties on the measured particle-level distributions for $\sqrt{d_0}$ (top) and $\sqrt{d_3}$ (bottom)
in the $W \to e\nu$ (left) and $W \to \mu\nu$ (right) channels.

For each uncertainty an error band was calculated, 
where the upper limit is defined as the variation leading 
to larger values compared to the nominal distribution and 
the lower limit as the variation leading to lower values. 
To avoid underestimating the uncertainty in bins where 
statistical fluctuations were large, if both variations led to 
a shift in the same direction the larger difference with re- 
spect to the nominal distribution was taken as a symmet- 
ric uncertainty. Correlations between separate sources of 
systematic uncertainties and between different bins of the 
distributions were not considered. The quadratic sum of 
all systematic uncertainties considered above was taken to 
be the overall systematic uncertainty on the distributions. 
The overall systematic uncertainty ranges between 10%
and 60% for the $\sqrt{d_k}$ distributions, being largest for small
splitting scales and in the vicinity of $\sqrt{d_k} \sim 15$ GeV. The
uncertainty is smallest in the vicinity of $\sqrt{d_k} \sim 10$ GeV as
this corresponds to the peak of the distribution and is thus
less sensitive to scale uncertainties. For the $\sqrt{d_{k+1}/d_k}$
ratio distributions the overall systematic uncertainty ranges
between 5% and 95%, being largest for small values of the
ratios. The statistical uncertainty on the unfolded
measurement was combined in quadrature with the systematic
uncertainty to obtain the total uncertainty.
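
The construction of the error bands and their combination can be sketched for a single bin as follows. The function names are hypothetical, shifts are expressed as absolute differences from the nominal bin content, and the statistical uncertainty is assumed to be added identically to the upper and lower band.

```python
import math

def error_band(nominal, var_a, var_b):
    """Return (up, down) absolute uncertainties for one source in one bin."""
    shift_a = var_a - nominal
    shift_b = var_b - nominal
    if shift_a * shift_b > 0.0:
        # Both variations shift the same way: use the larger one symmetrically.
        sym = max(abs(shift_a), abs(shift_b))
        return sym, sym
    up = max(shift_a, shift_b, 0.0)      # variation giving larger values
    down = max(-shift_a, -shift_b, 0.0)  # variation giving lower values
    return up, down

def total_uncertainty(bands, stat):
    """Quadrature sum of all systematic bands and the statistical uncertainty."""
    up = math.sqrt(sum(u ** 2 for u, _ in bands) + stat ** 2)
    down = math.sqrt(sum(d ** 2 for _, d in bands) + stat ** 2)
    return up, down
```

Summing in quadrature corresponds to treating the sources as uncorrelated, in line with the statement that correlations between sources were not considered.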

7 Results 

The different MC simulations in Sect. 3 were compared 
to the data using Rivet [66]. The FastJet library [19] was
used to construct the $k_t$ cluster sequence. Figures 6 and
7 display the $\sqrt{d_k}$ distributions, which have been individually
normalised to unity to allow for shape comparisons.
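
To make the observable concrete, the following is a minimal pure-Python sketch of a $k_t$ cluster sequence that records the splitting scales $d_k$. It is not the FastJet implementation used in the analysis. It assumes the standard hadron-collider distance measures $d_{ij} = \min(p_{T,i}^2, p_{T,j}^2)\,\Delta R_{ij}^2 / R^2$ and $d_{iB} = p_{T,i}^2$, four-vector (E-scheme) recombination, and a radius parameter of 0.6 as the default; all function names are hypothetical.

```python
import math

def _pt2(p):
    return p[0] ** 2 + p[1] ** 2

def _rapidity(p):
    return 0.5 * math.log((p[3] + p[2]) / (p[3] - p[2]))

def _delta_r2(a, b):
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    dy = _rapidity(a) - _rapidity(b)
    return dy ** 2 + dphi ** 2

def splitting_scales(particles, radius=0.6):
    """Return [d_0, d_1, ...] in GeV^2 for four-vectors (px, py, pz, E)."""
    objs = [tuple(p) for p in particles]
    scales = []
    while objs:
        # Start from the smallest beam distance d_iB = pT_i^2.
        best = min(range(len(objs)), key=lambda i: _pt2(objs[i]))
        d_min, pair = _pt2(objs[best]), (best, None)
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                d_ij = (min(_pt2(objs[i]), _pt2(objs[j]))
                        * _delta_r2(objs[i], objs[j]) / radius ** 2)
                if d_ij < d_min:
                    d_min, pair = d_ij, (i, j)
        i, j = pair
        if j is not None:
            # Merge the closest pair by adding four-vectors.
            merged = tuple(a + b for a, b in zip(objs[i], objs[j]))
            objs = [o for n, o in enumerate(objs) if n not in (i, j)] + [merged]
        else:
            # Closest to the beam: remove the object from the list.
            objs.pop(i)
        scales.append(d_min)  # scale of the step from k+1 to k objects
    scales.reverse()  # the last clustering step is d_0
    return scales
```

The input would be the hadronic activity of the event, excluding the W decay products. The quadratic scan over pairs is chosen for readability, not speed.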

The Alpgen+Herwig MC simulation generally agrees 
very well with the data, as already seen in the detector- 
level distributions. The discrepancies between the MC and 
data distributions are covered by the systematic and sta- 
tistical uncertainties. The Sherpa predictions are almost 
identical to those from Alpgen+Herwig in the hard region
of the distributions, $\sqrt{d_k} > 20$ GeV, where tree-level
matrix elements are applied.

All three generators based on NLO+PS methods, i.e.
Mc@Nlo, Powheg+Pythia6 and Powheg+Pythia8,
predict significantly less hard activity than that found in
data. As expected, this effect is strongest for higher
multiplicities $k > 1$, where in NLO+PS generators no matrix
elements are used for the description of the QCD emission.
It is interesting that they also do not describe well the hard
tail of the hardest splitting scale $\sqrt{d_0}$, even though they
are nominally at the same leading-order accuracy as
Alpgen+Herwig and Sherpa in this distribution. This may
be due to differences in higher-multiplicity parton
processes becoming relevant in that region, to different scale
choices in the real-emission matrix element, or to a
combination of both.

Fig. 5. Summary of the systematic uncertainties on the measured particle-level ratios for $\sqrt{d_1/d_0}$ (top) and $\sqrt{d_3/d_2}$ (bottom)
in the $W \to e\nu$ (left) and $W \to \mu\nu$ (right) channels.

In the intermediate region of 10-20 GeV, both Sherpa 
and Mc@Nlo show a similar excess over data in all $\sqrt{d_k}$.
For Sherpa it is compensated by an undershoot in the 
very soft region, while for Mc@Nlo the soft region is de- 
scribed well. Powheg+Pythia6 and Powheg+Pythia8 
also agree with data in the soft region, and their deviations 
from each other due to the differences in parton showering 
and hadronisation lie within the experimental uncertainties.
They give identical predictions for the hard region
of $\sqrt{d_0}$, where both of them should be dominated by an
identical real-emission matrix element. This confirms the
expectation that the hard region is dominated by perturbative
effects while resummation and non-perturbative
effects have a large influence in the softer regions.

The distributions of the ratios $\sqrt{d_{k+1}/d_k}$ are displayed
in Fig. 8. These probe the probability for a QCD emission
of hardness $\sqrt{d_{k+1}}$ given a previous emission of scale
$\sqrt{d_k}$. The Herwig parton shower used with both Alpgen
and Mc@Nlo gives the best description of these
observables. None of the ratio observables are expected
to be dominated by perturbative effects, since the bulk
of the events are collected near the lower threshold at
$\sqrt{d_k} = 20$ GeV, and $\sqrt{d_{k+1}}$ is always softer than $\sqrt{d_k}$.
The Powheg predictions, particularly for the case where
Powheg is matched to Pythia6, deviate from the data
in the ratio of the hardest and second-hardest clustering,
$\sqrt{d_1/d_0}$. This is the only ratio observable that directly
probes the NLO+PS matching in Powheg and
Mc@Nlo.
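
A ratio observable for one event could be formed as in the sketch below. The function name is hypothetical, the input is assumed to be a list of splitting scales $d_k$ in GeV$^2$ ordered from $d_0$ upwards, and the 20 GeV threshold is applied as a lower cut on $\sqrt{d_k}$, which is an interpretation of the text, not a confirmed detail of the analysis code.

```python
import math

def splitting_ratio(scales, k, threshold=20.0):
    """Return sqrt(d_{k+1}/d_k), or None if the event fails the selection."""
    if k + 1 >= len(scales):
        return None  # not enough clusterings in this event
    if math.sqrt(scales[k]) < threshold:
        return None  # assumption: require sqrt(d_k) of at least 20 GeV
    return math.sqrt(scales[k + 1] / scales[k])
```

Because $d_{k+1} \leq d_k$ in an ordered sequence, the returned value lies between 0 and 1.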



8 Conclusions 

A first measurement of the $k_t$ cluster splitting scales in
W boson production at a hadron-hadron collider has been
presented. The measurement was performed using the 2010
data sample from pp collisions at $\sqrt{s} = 7$ TeV collected
with the ATLAS detector at the LHC. The data correspond
to approximately 36 pb$^{-1}$ in both the electron and
muon W-decay channels.

Fig. 6. Distributions of $\sqrt{d_0}$ (top) and $\sqrt{d_1}$ (bottom) in the $W \to e\nu$ (left) and $W \to \mu\nu$ (right) channels, shown at particle
level. The data (markers) are compared to the predictions from various MC generators, and the shaded bands represent the
quadrature sum of systematic and statistical uncertainties on each bin. The histograms have been normalised to unity.
[Figure panels: horizontal axis $\sqrt{d_k}$ [GeV]; labels ATLAS, Data 2010, $\sqrt{s} = 7$ TeV, $\int L\,dt = 36$ pb$^{-1}$; legend: Data (syst + stat unc.), Alpgen+Herwig, Sherpa (Menlops), Mc@Nlo, Powheg+Pythia6, Powheg+Pythia8.]

Results are presented for the four hardest splitting 
scales in a $k_t$ cluster sequence, and ratios of these splitting
scales. Backgrounds were subtracted and the results were 
corrected for detector effects to allow a comparison to dif- 
ferent generator predictions at particle level. A weighted 
combination was performed to optimise the precision of 
the measurement. The dominant systematic uncertainties 
on the measurements originate from the cluster energy 
scale, pileup and the unfolding procedure. 

The degree of agreement between the various Monte Carlo
simulations and the data varies strongly across different
regions of the observables. The hard tails of the distributions
are significantly better described by the multi-leg generators
Alpgen+Herwig and Sherpa, which include exact
tree-level matrix elements, than by the NLO+PS generators
Mc@Nlo and Powheg. This also holds true for the
hardest clustering, $\sqrt{d_0}$, even though it is formally
predicted at the same QCD leading-order accuracy by all of
these generators.

In the soft regions of the splitting scales, larger varia- 
tions between all generators become evident. The genera- 
tors based on the Herwig parton shower provide a good 
description of the data, while the Sherpa and Powheg+Pythia
predictions do not reproduce the soft regions of
the measurement well. 

With this discriminating power the data thus test the 
resummation shape generated by parton showers and the 
extent to which the shower accuracy is preserved by the 
different merging and matching methods used in these 
Monte Carlo simulations. 

A Appendices

A.1 Additional detector-level comparisons

Fig. 9. Uncorrected splitting scales $\sqrt{d_1}$ (left), $\sqrt{d_2}$ (middle) and $\sqrt{d_3}$ (right) for events passing the $W \to e\nu$ (top) and $W \to \mu\nu$
(bottom) selection requirements. The distributions from the data (markers) are compared with the predicted signal from the 
MC simulation, provided by Alpgen+Herwig and normalised to the NNLO prediction. In addition, physics backgrounds, also 
shown, have been added in proportion to the predictions from the MC simulation. The ratio between the expectation and the 
data is shown in the lower plot. The error bars shown on the data are statistical only. 

A.2 Additional summaries of systematic uncertainties